Heterodimeric Meganucleases and Use Thereof

ABSTRACT

Heterodimeric meganuclease comprising two domains of different meganucleases which are in two separate polypeptides, said heterodimeric meganuclease being able to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease DNA target sequence.
Use of said heterodimeric meganuclease and derived products for genetic engineering, genome therapy and antiviral therapy.

The invention relates to a heterodimeric meganuclease comprising two domains of different meganucleases which are in two separate polypeptides, said heterodimeric meganuclease being able to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease DNA target sequence.

The invention relates also to a vector encoding said heterodimeric meganuclease, to a cell, an animal or a plant modified by said vector and to the use of said heterodimeric meganuclease and derived products for genetic engineering, genome therapy and antiviral therapy.

Meganucleases are by definition sequence-specific endonucleases with large (>12 bp) cleavage sites and they can be used to achieve very high levels of gene targeting efficiencies in mammalian cells and plants (Rouet et al., Mol. Cell. Biol., 1994, 14, 8096-106; Choulika et al., Mol. Cell. Biol., 1995, 15, 1968-73; Donoho et al., Mol. Cell. Biol., 1998, 18, 4070-8; Elliott et al., Mol. Cell. Biol., 1998, 18, 93-101; Sargent et al., Mol. Cell. Biol., 1997, 17, 267-77; Puchta et al., Proc. Natl. Acad. Sci. USA, 1996, 93, 5055-60), making meganuclease-induced recombination an efficient and robust method for genome engineering. The major limitation of the current technology is the requirement for the prior introduction of a meganuclease cleavage site in the locus of interest. Thus, the generation of novel meganucleases with tailored specificities is under intense investigation. Such proteins could be used to cleave genuine chromosomal sequences and open a wide range of applications, including the correction of mutations responsible for inherited monogenic diseases.

Recently, fusions of Cys2-His2 type Zinc-Finger Proteins (ZFP) with the catalytic domain of the FokI nuclease were used to make functional sequence-specific endonucleases (Smith et al., Nucleic Acids Res., 1999, 27, 674-81; Urnov et al., Nature, 2005, 435, 646-651). The binding specificity of ZFPs is relatively easy to manipulate, and a repertoire of novel artificial ZFPs, able to bind many (g/a)nn(g/a)nn(g/a)nn sequences, is now available (Pabo et al., Annu. Rev. Biochem., 2001, 70, 313-40; Segal and Barbas, Curr. Opin. Biotechnol., 2001, 12, 632-7; Isalan et al., Nat. Biotechnol., 2001, 19, 656-60). Nevertheless, preserving a very narrow specificity is one of the major issues for genome engineering applications, and presently it is unclear whether ZFPs would fulfill the very strict requirements for therapeutic applications.

Homing Endonucleases (HEs) are a widespread family of natural meganucleases including hundreds of proteins (Chevalier and Stoddard, Nucleic Acids Res., 2001, 29, 3757-74). These proteins are encoded by mobile genetic elements which propagate by a process called “homing”: the endonuclease cleaves a cognate allele from which the mobile element is absent, thereby stimulating a homologous recombination event that duplicates the mobile DNA into the recipient locus (Kostriken et al., Cell, 1983, 35, 167-74; Jacquier and Dujon, Cell, 1985, 41, 383-94). Given their natural function and their exceptional cleavage properties in terms of efficacy and specificity, HEs provide ideal scaffolds to derive novel endonucleases for genome engineering. Data have been accumulated over the last decade, characterizing the LAGLIDADG family, the largest of the four HE families (Chevalier and Stoddard, precited). LAGLIDADG refers to the only sequence actually conserved throughout the family and is found in one or (more often) two copies in the protein. Proteins with a single motif, such as I-CreI, form homodimers and cleave palindromic or pseudo-palindromic DNA sequences (FIG. 1), whereas the larger, double motif proteins, such as I-SceI, are monomers and cleave non-palindromic targets. Seven different LAGLIDADG proteins have been crystallized, and they exhibit a very striking conservation of the core structure, which contrasts with the lack of similarity at the primary sequence level (Jurica et al., Mol. Cell., 1998, 2, 469-76; Chevalier et al., Nat. Struct. Biol., 2001, 8, 312-6; Chevalier et al., J. Mol. Biol., 2003, 329, 253-69; Moure et al., J. Mol. Biol., 2003, 334, 685-95; Moure et al., Nat. Struct. Biol., 2002, 9, 764-70; Ichiyanagi et al., J. Mol. Biol., 2000, 300, 889-901; Duan et al., Cell, 1997, 89, 555-64; Bolduc et al., Genes Dev., 2003, 17, 2875-88; Silva et al., J. Mol. Biol., 1999, 286, 1123-36).
In this core structure, two characteristic αββαββα folds, also called LAGLIDADG Homing Endonuclease Core Domains, contributed by two monomers, or by two domains in double LAGLIDADG proteins, are facing each other with a two-fold symmetry. DNA binding depends on the four β strands from each domain, folded into an antiparallel β-sheet, and forming a saddle on the DNA helix major groove. Analysis of I-CreI structure bound to its natural target shows that in each monomer, eight residues establish direct interactions with seven bases (Jurica et al., 1998, precited). Residues Q44, R68 and R70 contact three consecutive base pairs at positions 3 to 5 and −3 to −5 (FIG. 1). The catalytic core is central, with a contribution of both symmetric monomers/domains. In addition to this core structure, other domains can be found: for example, PI-SceI, an intein, has a protein splicing domain, and an additional DNA-binding domain (Moure et al., 2002, precited; Grindl et al., Nucleic Acids Res., 1998, 26, 1857-62).

Two approaches have been used to derive novel endonucleases with new specificities from Homing Endonucleases:

protein variants

Seligman and co-workers used a rational approach to substitute specific individual residues of the I-CreI αββαββα fold (Sussman et al., J. Mol. Biol., 2004, 342, 31-41; Seligman et al., Genetics, 1997, 147, 1653-64); substantial cleavage of novel targets was observed, but for only a few I-CreI variants.

In a similar way, Gimble et al. modified the additional DNA binding domain of PI-SceI (J. Mol. Biol., 2003, 334, 993-1008); they obtained variant proteins with altered binding specificity but no altered cleavage specificity, and most of the proteins maintained substantial affinity for the wild-type target sequence.

hybrid or chimeric single-chain proteins

New meganucleases could be obtained by swapping LAGLIDADG Homing Endonuclease Core Domains of different monomers (Epinat et al., Nucleic Acids Res., 2003, 31, 2952-62; Chevalier et al., Mol. Cell., 2002, 10, 895-905; Steuer et al., Chembiochem., 2004, 5, 206-13; International PCT Applications WO 03/078619 and WO 2004/031346). These single-chain chimeric meganucleases, wherein the two LAGLIDADG Homing Endonuclease Core Domains from different meganucleases are linked by a spacer, were able to cleave the hybrid target corresponding to the fusion of the two half parent DNA target sequences.

By coexpressing two domains from different meganucleases, the inventors have engineered functional heterodimeric meganucleases, which are able to cleave chimeric targets. This new approach, which can be applied to any meganuclease (monomer with two domains or homodimer), including the variants derived from wild-type meganucleases, considerably increases the number of DNA sequences that can be targeted, resulting in the generation of dedicated meganucleases able to cleave sequences from many genes of interest. Potential applications include the specific cleavage of viral genomes or the correction of genetic defects via double-strand break induced recombination, both of which may lead to therapeutics.

Therefore, the invention concerns a heterodimeric meganuclease comprising two domains of different meganucleases (parent meganucleases), wherein said domains are in two separate polypeptides which are able to assemble and to cleave a chimeric DNA target sequence comprising one different half of each parent meganuclease DNA target sequence.

As opposed to the hybrid or chimeric meganucleases wherein the two meganuclease subunits, which each interact with a different half of a meganuclease target sequence, are in a single polypeptide, in the heterodimeric meganuclease according to the invention, each subunit is expressed from a separate polypeptide. The two polypeptides, which are different and originate from different meganucleases, assemble to form a functional heterodimeric meganuclease.

Definitions

Amino acid residues in a polypeptide sequence are designated herein according to the one-letter code, in which, for example, Q means Gln or Glutamine residue, R means Arg or Arginine residue and D means Asp or Aspartic acid residue.

Nucleotides are designated as follows: one-letter code is used for designating the base of a nucleoside: a is adenine, t is thymine, c is cytosine, and g is guanine. For the degenerated nucleotides, r represents g or a (purine nucleotides), k represents g or t, s represents g or c, w represents a or t, m represents a or c, y represents t or c (pyrimidine nucleotides), d represents g, a or t, v represents g, a or c, b represents g, t or c, h represents a, t or c, and n represents g, a, t or c.
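The degenerate code listed above can be written down as a lookup table. The following Python sketch is illustrative only and is not part of the original disclosure; the names DEGENERATE and matches are hypothetical. It checks whether a concrete sequence is compatible with a degenerate pattern such as the (g/a)nn(g/a)nn(g/a)nn motif, written here as rnnrnnrnn.

```python
# Degenerate nucleotide codes, transcribed from the paragraph above.
# DEGENERATE and matches() are illustrative names, not from the source.
DEGENERATE = {
    "a": "a", "t": "t", "c": "c", "g": "g",
    "r": "ga", "k": "gt", "s": "gc", "w": "at", "m": "ac", "y": "tc",
    "d": "gat", "v": "gac", "b": "gtc", "h": "atc", "n": "gatc",
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if `sequence` (a, t, c, g only) fits the degenerate `pattern`."""
    pattern, sequence = pattern.lower(), sequence.lower()
    if len(pattern) != len(sequence):
        return False
    # Every base must belong to the set allowed by the code at that position.
    return all(base in DEGENERATE[code] for code, base in zip(pattern, sequence))
```

For instance, matches("rnnrnnrnn", "gctaatgcc") holds because positions 1, 4 and 7 carry a purine, whereas a sequence starting with t would be rejected.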

by “meganuclease”, is intended an endonuclease having a double-stranded DNA target sequence of 14 to 40 bp. Said meganuclease is either a dimeric enzyme, wherein each domain is on a monomer, or a monomeric enzyme comprising the two domains on a single polypeptide.

by “meganuclease domain” is intended the region which interacts with one half of the DNA target of a meganuclease and is able to associate with the other domain of the same meganuclease which interacts with the other half of the DNA target to form a functional meganuclease able to cleave said DNA target.

by “meganuclease variant” is intended a meganuclease obtained by replacement of at least one residue in the amino acid sequence of the wild-type meganuclease (natural meganuclease) with a different amino acid.

by “functional variant” is intended a variant which is able to cleave a DNA target sequence, preferably said target is a new target which is not cleaved by the parent meganuclease.

by “LAGLIDADG Homing Endonuclease Core Domain”, is intended the characteristic αββαββα fold of the homing endonuclease of the LAGLIDADG family, corresponding to a sequence of about one hundred amino acid residues. For example, in the case of the dimeric homing endonuclease I-CreI (163 amino acids), the LAGLIDADG Homing Endonuclease Core Domain corresponds to the residues 6 to 94. In the case of monomeric homing endonuclease, two such domains are found in the sequence of the endonuclease; for example in I-DmoI (194 amino acids), the first domain (residues 7 to 99) and the second domain (residues 104 to 194) are separated by a short linker (residues 100 to 103).

by “DNA target sequence”, “DNA target”, “target sequence”, “target”, “recognition site”, “recognition sequence”, “homing recognition site”, “homing site”, “cleavage site” is intended a 14 to 40 bp double-stranded palindromic, partially palindromic (pseudo-palindromic) or non-palindromic polynucleotide sequence that is recognized and cleaved by a meganuclease. These terms refer to a distinct DNA location, preferably a genomic location, at which a double stranded break (cleavage) is to be induced by the meganuclease. The DNA target is defined by the 5′ to 3′ sequence of one strand of the double-stranded polynucleotide. For example, the palindromic DNA target sequence cleaved by wild-type I-CreI presented in FIG. 1 is defined by the sequence 5′-t⁻¹²c⁻¹¹a⁻¹⁰a⁻⁹a⁻⁸a⁻⁷c⁻⁶g⁻⁵t⁻⁴c⁻³g⁻²t⁻¹a₊₁c₊₂g₊₃a₊₄c₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁a₊₁₂ (SEQ ID NO: 1), wherein the bases interacting with R68, Q44 and R70 are from positions −5 to −3 and +5 to +3.
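The position numbering and the palindromic character of the I-CreI target quoted above can be checked with a few lines of Python. This is a minimal sketch for illustration; the function names are hypothetical, and the only data taken from the text is the 24 bp sequence of SEQ ID NO: 1 and its numbering from −12 to +12 with no position 0.

```python
# Palindromic I-CreI target as spelled out in the definition above (SEQ ID NO: 1).
I_CREI_TARGET = "tcaaaacgtcgtacgacgttttga"

_COMPLEMENT = str.maketrans("atcg", "tagc")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A DNA palindrome is identical to its own reverse complement."""
    return seq.lower() == reverse_complement(seq)


def position_labels(length: int) -> list:
    """Labels -n..-1, +1..+n for an even-length target (there is no position 0)."""
    half = length // 2
    return list(range(-half, 0)) + list(range(1, half + 1))


def bases_at(seq: str, first: int, last: int) -> str:
    """Bases of the given strand whose position label lies in [first, last]."""
    labels = position_labels(len(seq))
    return "".join(b for b, p in zip(seq.lower(), labels) if first <= p <= last)
```

With these helpers, bases_at(I_CREI_TARGET, -5, -3) gives the triplet at positions −5 to −3 of the top strand, and bases_at(I_CREI_TARGET, 3, 5) gives the bases at +3 to +5, which are the positions contacted by R68, Q44 and R70 in the two half-sites.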

by “chimeric DNA target” or “hybrid DNA target” is intended the fusion of a different half of each parent meganuclease DNA target sequence.

by “vector” is intended a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.

by “homologous” is intended a sequence with enough identity to another one to lead to a homologous recombination between sequences, more particularly having at least 95% identity, preferably 97% identity and more preferably 99%.

“Identity” refers to sequence identity between two nucleic acid molecules or polypeptides. Identity can be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are identical at that position. A degree of similarity or identity between nucleic acid or amino acid sequences is a function of the number of identical or matching residues at positions shared by the sequences. Various alignment algorithms and/or programs may be used to calculate the identity between two sequences, including FASTA, or BLAST which are available as a part of the GCG sequence analysis package (University of Wisconsin, Madison, Wis.), and can be used with, e.g., default settings.
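The notion of identity defined above can be illustrated by counting matching positions in two sequences that have already been aligned. The sketch below is an assumption-laden illustration, not the method of FASTA or BLAST: it assumes the alignment is supplied by the caller, that gap columns (written "-") are excluded from the count, and that the 95% figure from the definition of “homologous” is used as the default threshold. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical positions among the aligned, non-gap columns.

    Assumption: columns containing a gap in either sequence are ignored;
    alignment programs may use other denominators.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.lower(), aligned_b.lower()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / compared


def is_homologous(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """Apply an identity threshold (95% by default, per the definition above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Two aligned 20-mers differing at a single position are 95% identical under this counting and would just meet the lowest threshold named in the definition.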

The polypeptides forming the heterodimeric meganuclease of the invention may derive from a natural (wild-type) meganuclease or a functional variant thereof.

Preferred variants are variants having a modified specificity, i.e., variants able to cleave a DNA target sequence which is not cleaved by the wild-type meganuclease. For example, such variants may have amino acid variation at positions contacting the DNA target sequence or interacting directly or indirectly with said DNA target.

The polypeptides forming the heterodimeric meganuclease of the invention may comprise, consist essentially of or consist of, one domain as defined above. In the case of dimeric meganuclease, said polypeptide may consist of the entire open reading frame of the meganuclease (full-length amino acid sequence).

Said polypeptides may include one or more residues inserted at the NH₂ terminus and/or COOH terminus of said domain. For example, a methionine residue is introduced at the NH₂ terminus, or a tag (epitope or polyhistidine sequence) is introduced at the NH₂ terminus and/or COOH terminus; said tag is useful for the detection and/or the purification of said polypeptide.

The cleavage activity of the heterodimeric meganuclease of the invention may be measured by a direct repeat recombination assay, in yeast or mammalian cells, using a reporter vector, as described in the PCT Application WO 2004/067736. The reporter vector comprises two truncated, non-functional copies of a reporter gene (direct repeats) and a chimeric DNA target sequence within the intervening sequence, cloned in a yeast or a mammalian expression vector (FIG. 2). The chimeric DNA target sequence is made of one different half of each parent meganuclease DNA target sequence (FIG. 5). Coexpression of the two polypeptides results in the assembly of a functional heterodimer which is able to cleave the chimeric DNA target sequence. This cleavage induces homologous recombination between the direct repeats, resulting in a functional reporter gene, whose expression can be monitored by an appropriate assay.

According to an advantageous embodiment of said heterodimeric meganuclease, each polypeptide comprises the LAGLIDADG Homing Endonuclease Core Domain of a different LAGLIDADG homing endonuclease or a variant thereof; said LAGLIDADG homing endonuclease may be either a homodimeric enzyme such as I-CreI, or a monomeric enzyme such as I-DmoI.

The LAGLIDADG homing endonuclease may be selected from the group consisting of: I-SceI, I-ChuI, I-CreI, I-CsmI, PI-SceI, PI-TliI, PI-MtuI, I-CeuI, I-SceII, I-SceIII, HO, PI-CivI, PI-CtrI, PI-AaeI, PI-BsuI, PI-DhaI, PI-DraI, PI-MavI, PI-MchI, PI-MfuI, PI-MflI, PI-MgaI, PI-MgoI, PI-MinI, PI-MkaI, PI-MleI, PI-MmaI, PI-MshI, PI-MsmI, PI-MthI, PI-MtuI, PI-MxeI, PI-NpuI, PI-PfuI, PI-RmaI, PI-SpbI, PI-SspI, PI-FacI, PI-MjaI, PI-PhoI, PI-TagI, PI-ThyI, PI-TkI, PI-TspI, I-MsoI, and I-AniI; preferably, I-CreI, I-SceI, I-ChuI, I-DmoI, I-CsmI, PI-SceI, PI-PfuI, PI-TliI, PI-MtuI, and I-CeuI; more preferably, I-CreI, I-MsoI, I-SceI, I-AniI, I-DmoI, PI-SceI, and PI-PfuI; still more preferably I-CreI.

In a preferred embodiment, one of the polypeptides comprises the LAGLIDADG Homing Endonuclease Core Domain of an I-CreI variant having at least one substitution in positions 44, 68, and/or 70 of I-CreI, by reference to the amino acid numbering of the I-CreI sequence SWISSPROT P05725.

Said polypeptide may for example consist of the entire open reading frame of said I-CreI variant.

In a more preferred embodiment, said residues in positions 44, 68, and/or 70 of I-CreI are replaced with an amino acid selected in the group consisting of: A, D, E, G, H, K, N, P, Q, R, S, T, and Y.

In another more preferred embodiment, said I-CreI variant further comprises the mutation of the aspartic acid in position 75 to an uncharged amino acid, preferably an asparagine (D75N) or a valine (D75V).

In another more preferred embodiment, said heterodimeric LAGLIDADG homing endonuclease, comprising two polypeptides derived from I-CreI and/or I-CreI variant(s) having at least one substitution in positions 44, 68, and/or 70 of I-CreI, cleaves a chimeric DNA target comprising the sequence: c⁻¹¹a⁻¹⁰a⁻⁹a⁻⁸a⁻⁷c⁻⁶n⁻⁵n⁻⁴n⁻³n⁻²n⁻¹n₊₁n₊₂n₊₃n₊₄n₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁, wherein n is a, t, c, or g (SEQ ID NO: 2).
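The 22 bp consensus given above (fixed bases at positions −11 to −6 and +6 to +11, any base at the ten central positions) translates directly into a regular expression. The following Python sketch is illustrative only; the names SEQ_ID_NO_2 and find_chimeric_targets are hypothetical, and a match merely indicates that a site fits the consensus, not that it is actually cleaved.

```python
import re

# caaaac (positions -11..-6) + 10 unconstrained bases (-5..+5) + gttttg (+6..+11).
# The lookahead lets overlapping candidate sites be reported as well.
SEQ_ID_NO_2 = re.compile(r"(?=(caaaac[atcg]{10}gttttg))")


def find_chimeric_targets(dna: str) -> list:
    """Return (start index, 22-bp site) for every site fitting the consensus."""
    return [(m.start(), m.group(1)) for m in SEQ_ID_NO_2.finditer(dna.lower())]
```

Because the reverse complement of the fixed flank caaaac is gttttg, a site fitting this consensus on one strand also fits it on the other, so scanning a single strand is sufficient in this sketch.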

More preferably, for cleaving a chimeric DNA target wherein n⁻⁴ is t or n₊₄ is a, one of the polypeptides has a glutamine (Q) in position 44.

More preferably, for cleaving a chimeric DNA target wherein n⁻⁴ is a or n₊₄ is t, one of the polypeptides has an alanine (A) or an asparagine (N) in position 44; the I-CreI variants A44, R68, S70 and A44, R68, S70, N75 are examples of such a polypeptide.

More preferably, for cleaving a chimeric DNA target wherein n⁻⁴ is c or n₊₄ is g, one of the polypeptides has a lysine (K) in position 44; the I-CreI variants K44, R68, E70 and K44, R68, E70, N75 are examples of such a polypeptide.
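The three preferences stated above for position 44 can be summarized as a lookup keyed on the base at position −4 of each half-site. The Python sketch below is a hedged illustration and not a prediction tool: the function and table names are hypothetical, it assumes a 22 bp target laid out as in SEQ ID NO: 2 (position −11 first, no position 0), and it reads the right half-site on the bottom strand, so that n₊₄ = a, t or g on the top strand corresponds to t, a or c respectively. A base for which the text states no preference yields None.

```python
# Preferred residue(s) at position 44 for a given base at -4 of a half-site,
# as stated in the three paragraphs above (no preference is stated for g).
RESIDUE_44_BY_BASE = {"t": ("Q",), "a": ("A", "N"), "c": ("K",)}
_COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def preferred_residue_44(target_22bp: str) -> tuple:
    """Return (left half-site preference, right half-site preference).

    Indexing assumption: position -11 is index 0, so -4 is index 7 and +4 is index 14.
    """
    target = target_22bp.lower()
    if len(target) != 22:
        raise ValueError("expected a 22-bp target laid out as SEQ ID NO: 2")
    left_base = target[7]                 # n at -4, top strand
    right_base = _COMPLEMENT[target[14]]  # n at +4, read on the bottom strand
    return RESIDUE_44_BY_BASE.get(left_base), RESIDUE_44_BY_BASE.get(right_base)
```

For an illustrative target such as caaaacgttgtacagggttttg, which has t at −4 and g at +4, the lookup suggests a glutamine in one polypeptide and a lysine in the other.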

The subject-matter of the present invention is also a recombinant vector comprising two polynucleotide fragments, each encoding a different polypeptide as defined above.

One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as “expression vectors”.

A vector according to the present invention comprises, but is not limited to, a YAC (yeast artificial chromosome), a BAC (bacterial artificial chromosome), a baculovirus vector, a phage, a phagemid, a cosmid, a viral vector, a plasmid, an RNA vector or a linear or circular DNA or RNA molecule which may consist of chromosomal, non-chromosomal, semi-synthetic or synthetic DNA. In general, expression vectors of utility in recombinant DNA techniques are often in the form of “plasmids”, which refer generally to circular double-stranded DNA loops which, in their vector form, are not bound to the chromosome. Large numbers of suitable vectors are known to those of skill in the art.

Viral vectors include retrovirus, adenovirus, parvovirus (e.g., adeno-associated viruses), coronavirus, negative strand RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g., measles and Sendai), positive strand RNA viruses such as picornavirus and alphavirus, and double-stranded DNA viruses including adenovirus, herpesvirus (e.g., Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, fowlpox and canarypox). Other viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, and hepatitis virus, for example.

Vectors can comprise selectable markers, for example: neomycin phosphotransferase, histidinol dehydrogenase, dihydrofolate reductase, hygromycin phosphotransferase, herpes simplex virus thymidine kinase, adenosine deaminase, glutamine synthetase, and hypoxanthine-guanine phosphoribosyl transferase for eukaryotic cell culture; TRP1 for S. cerevisiae; tetracycline, rifampicin or ampicillin resistance in E. coli.

Preferably said vectors are expression vectors, wherein the sequences encoding the polypeptides of the invention are placed under control of appropriate transcriptional and translational control elements to permit production or synthesis of said polypeptides. Therefore, said polynucleotides are comprised in expression cassette(s). More particularly, the vector comprises a replication origin, a promoter operatively linked to said encoding polynucleotide, a ribosome binding site, an RNA-splicing site (when genomic DNA is used), a polyadenylation site and a transcription termination site. It also can comprise an enhancer. Selection of the promoter will depend upon the cell in which the polypeptide is expressed.

According to another advantageous embodiment of said vector, it includes a targeting construct comprising sequences sharing homologies with the region surrounding the chimeric DNA target sequence as defined above.

More preferably, said targeting DNA construct comprises:

a) sequences sharing homologies with the region surrounding the chimeric DNA target sequence as defined above, and

b) sequences to be introduced, flanked by sequences as in a).

The invention also concerns a prokaryotic or eukaryotic host cell which is modified by two polynucleotides or a vector as defined above, preferably an expression vector.

The invention also concerns a non-human transgenic animal or a transgenic plant, characterized in that all or part of their cells are modified by two polynucleotides or a vector as defined above.

As used herein, a cell refers to a prokaryotic cell, such as a bacterial cell, or eukaryotic cell, such as an animal, plant or yeast cell.

The polynucleotide sequences encoding the polypeptides as defined in the present invention may be prepared by any method known by the man skilled in the art. For example, they are amplified from a cDNA template, by polymerase chain reaction with specific primers. Preferably the codons of said cDNA are chosen to favour the expression of said protein in the desired expression system.

The recombinant vector comprising said polynucleotides may be obtained and introduced in a host cell by the well-known recombinant DNA and genetic engineering techniques.

The heterodimeric meganuclease of the invention is produced by expressing the two polypeptides as defined above; preferably said polypeptides are co-expressed in a host cell modified by two expression vectors, each comprising a polynucleotide fragment encoding a different polypeptide as defined above, or by a dual expression vector comprising both polynucleotide fragments as defined above, under conditions suitable for the co-expression of the polypeptides, and the heterodimeric meganuclease is recovered from the host cell culture.

The subject-matter of the present invention is further the use of a heterodimeric meganuclease, two polynucleotides, preferably both included in one expression vector (dual expression vector) or each included in a different expression vector, a dual expression vector, a cell, a transgenic plant, a non-human transgenic mammal, as defined above, for molecular biology, for in vivo or in vitro genetic engineering, and for in vivo or in vitro genome engineering, for non-therapeutic purposes.

Non-therapeutic purposes include for example (i) gene targeting of specific loci in cell packaging lines for protein production, (ii) gene targeting of specific loci in crop plants, for strain improvements and metabolic engineering, (iii) targeted recombination for the removal of markers in genetically modified crop plants, (iv) targeted recombination for the removal of markers in genetically modified microorganism strains (for antibiotic production for example).

According to an advantageous embodiment of said use, it is for inducing a double-strand break in a site of interest comprising a chimeric DNA target sequence, thereby inducing a DNA recombination event, a DNA loss or cell death.

According to the invention, said double-strand break is for: repairing a specific sequence, modifying a specific sequence, restoring a functional gene in place of a mutated one, attenuating or activating an endogenous gene of interest, introducing a mutation into a site of interest, introducing an exogenous gene or a part thereof, inactivating or deleting an endogenous gene or a part thereof, translocating a chromosomal arm, or leaving the DNA unrepaired and degraded.

According to another advantageous embodiment of said use, said heterodimeric meganuclease, polynucleotides, vector, cell, transgenic plant or non-human transgenic mammal are associated with a targeting DNA construct as defined above.

The subject-matter of the present invention is also a method of genetic engineering, characterized in that it comprises a step of double-strand nucleic acid breaking in a site of interest located on a vector comprising a chimeric DNA target as defined hereabove, by contacting said vector with a heterodimeric meganuclease as defined above, thereby inducing a homologous recombination with another vector presenting homology with the sequence surrounding the cleavage site of said heterodimeric meganuclease.

The subject-matter of the present invention is also a method of genome engineering, characterized in that it comprises the following steps: 1) double-strand breaking a genomic locus comprising at least one chimeric DNA target of a heterodimeric meganuclease as defined above, by contacting said target with said heterodimeric meganuclease; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with a targeting DNA construct comprising the sequence to be introduced in said locus, flanked by sequences sharing homologies with the targeted locus.

The subject-matter of the present invention is also a method of genome engineering, characterized in that it comprises the following steps: 1) double-strand breaking a genomic locus comprising at least one chimeric DNA target of a heterodimeric meganuclease as defined above, by contacting said cleavage site with said heterodimeric meganuclease; 2) maintaining said broken genomic locus under conditions appropriate for homologous recombination with chromosomal DNA sharing homologies to regions surrounding the cleavage site.

The subject-matter of the present invention is also a composition characterized in that it comprises at least one heterodimeric meganuclease or two polynucleotides, preferably both included in one expression vector or each included in a different expression vector, as defined above.

In a preferred embodiment of said composition, it comprises a targeting DNA construct comprising the sequence which repairs the site of interest flanked by sequences sharing homologies with the targeted locus.

The subject-matter of the present invention is also the use of at least one heterodimeric meganuclease or two polynucleotides, preferably included in expression vector(s), as defined above, for the preparation of a medicament for preventing, improving or curing a genetic disease in an individual in need thereof, said medicament being administered by any means to said individual.

The subject-matter of the present invention is also the use of at least one heterodimeric meganuclease or two polynucleotides, preferably included in expression vector(s), as defined above, for the preparation of a medicament for preventing, improving or curing a disease caused by an infectious agent that presents a DNA intermediate, in an individual in need thereof, said medicament being administered by any means to said individual.

The subject-matter of the present invention is also the use of at least one heterodimeric meganuclease or two polynucleotides, preferably included in expression vector(s), as defined above, in vitro, for inhibiting the propagation of, inactivating or deleting an infectious agent that presents a DNA intermediate, in biologically derived products or products intended for biological uses, or for disinfecting an object.

In a particular embodiment, said infectious agent is a virus.

In addition to the preceding features, the invention further comprises other features which will emerge from the description which follows, which refers to examples illustrating the I-CreI meganuclease variants and their uses according to the invention, as well as to the appended drawings in which:

FIG. 1 illustrates the rationale of the experiments. (a) Structure of I-CreI bound to its DNA target. (b) Zoom of the structure showing residues 44, 68, 70 chosen for randomization, D75 and interacting base pairs. (c) Design of the library and targets. The interactions of I-CreI residues Q44, R68 and R70 with DNA targets are indicated (top). The target described here (SEQ ID NO: 1) is a palindrome derived from the I-CreI natural target, and cleaved by I-CreI (Chevalier et al., 2003, precited). Cleavage positions are indicated by arrowheads. In the library, residues 44, 68 and 70 are replaced with ADEGHKNPQRST. Since I-CreI is a homodimer, the library was screened with palindromic targets. Sixty-four palindromic targets resulting from substitutions in positions ±3, ±4 and ±5 were generated. A few examples of such targets are shown (bottom; SEQ ID NO: 10 to 16).
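The sizes of the variant library and of the target set described in this legend follow from simple enumeration: 12 residues at each of 3 positions, and 4 bases at each of 3 positions mirrored across the palindrome. The Python sketch below is illustrative only. It assumes that every palindromic target keeps the SEQ ID NO: 1 scaffold outside positions ±3 to ±5, an assumption consistent with the legend but not a reproduction of SEQ ID NO: 10 to 16, and that targets are named by their bases at −5, −4 and −3; all function and variable names are hypothetical.

```python
from itertools import product

RESIDUES = "ADEGHKNPQRST"  # the 12 residues named in the legend
_COMPLEMENT = str.maketrans("atcg", "tagc")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def palindromic_target(triplet: str) -> str:
    """24-bp palindrome with `triplet` at positions -5..-3.

    Assumption: flanking bases are those of SEQ ID NO: 1
    (tcaaaac at -12..-6 and gt at -2..-1).
    """
    left_half = "tcaaaac" + triplet + "gt"
    return left_half + reverse_complement(left_half)


# Three-letter variant names (residues 44, 68, 70), e.g. "QRR" for wild-type I-CreI.
variants = ["".join(p) for p in product(RESIDUES, repeat=3)]

# One palindromic target per triplet at positions -5, -4, -3.
targets = {"".join(t): palindromic_target("".join(t)) for t in product("acgt", repeat=3)}
```

Here len(variants) is 12³ = 1728 possible residue combinations and len(targets) is 4³ = 64, and targets["gtc"] reproduces the palindromic sequence of SEQ ID NO: 1.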

FIG. 2 illustrates the screening of the variants. (a) Yeast screening assay principle. A strain harboring the expression vector encoding the variants is mated with a strain harboring a reporter plasmid. In the reporter plasmid, a LacZ reporter gene is interrupted with an insert containing the site of interest, flanked by two direct repeats. Upon mating, the endonuclease (gray oval) performs a double strand break on the site of interest, allowing restoration of a functional LacZ gene (white oval) by single strand annealing (SSA) between the two flanking direct repeats. (b) Scheme of the experiment. A library of I-CreI variants is built using PCR, cloned into a replicative yeast expression vector and transformed in S. cerevisiae strain FYC2-6A (MATα; trp1Δ63, leu2Δ1, his3Δ200). The 64 palindromic targets are cloned in the LacZ-based yeast reporter vector, and the resulting clones transformed into strain FYBL2-7B (MATa, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202). Robot-assisted gridding on filter membrane is used to perform mating between individual clones expressing meganuclease variants and individual clones harboring a reporter plasmid. After primary high throughput screening, the ORFs of positive clones are amplified by PCR and sequenced. 410 different variants were identified among the 2100 positives, and tested at low density, to establish complete patterns, and 350 clones were validated. Also, 294 mutants were recloned in yeast vectors, and tested in a secondary screen, and results confirmed those obtained without recloning. Chosen clones are then assayed for cleavage activity in a similar CHO-based assay and eventually in vitro.

FIG. 3 illustrates the cleavage patterns of the variants. Mutants are identified by three letters, corresponding to the residues in positions 44, 68 and 70. Each mutant is tested versus the 64 targets derived from the I-CreI natural targets, and a series of control targets. Target map is indicated in the top right panel. (a) Cleavage patterns in yeast (left) and mammalian cells (right) for the I-CreI protein, and 8 derivatives. For yeast, the initial raw data (filter) are shown. For CHO cells, quantitative raw data (ONPG measurement) are shown; values above 0.25 are boxed, values above 0.5 are highlighted in medium grey, values above 1 in dark grey. LacZ: positive control. 0: no target. U1, U2 and U3: three different uncleaved controls. (b) Cleavage in vitro. I-CreI and four mutants are tested against a set of 2 or 4 targets, including the target resulting in the strongest signal in yeast and CHO. Digests are performed at 37° C. for 1 hour, with 2 nM linearized substrate, as described in Methods. Raw data are shown for I-CreI with two different targets. With both GGG and CCT, cleavage is not detected with I-CreI.

FIG. 4 represents the statistical analysis. (a) Cleaved targets: targets cleaved by I-CreI variants are colored in grey. The number of proteins cleaving each target is shown below, and the level of grey coloration is proportional to the average signal intensity obtained with these cutters in yeast. (b) Hierarchical clustering of mutant and target data in yeast. Both mutants and targets were clustered using hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., American statist. Assoc., 1963, 58, 236-244). Clustering was done with hclust from the R package. Mutants and targets dendrograms were reordered to optimize positions of the clusters and the mutant dendrogram was cut at the height of 8 with deduced clusters. QRR mutant and GTC target are indicated by an arrow. Gray levels reflect the intensity of the signal. (c) Analysis of 3 out of the 7 clusters. For each mutant cluster (clusters 1, 3 and 7), the cumulated intensities for each target were computed and a bar plot (left column) shows in decreasing order the normalized intensities. For each cluster, the number of amino acids of each type at each position (44, 68 and 70) is shown as a coded histogram in the right column. The legend of amino-acid color code is at the bottom of the figure.

FIG. 5 illustrates an example of hybrid or chimeric site: gtt (SEQ ID NO: 17) and cct (SEQ ID NO: 9) are two palindromic sites derived from the I-CreI site. The gtt/cct hybrid site (SEQ ID NO: 18) displays the gtt sequence on the top strand in −5, −4, −3 and the cct sequence on the bottom strand in 5, 4, 3.

FIG. 6 illustrates the cleavage activity of the heterodimeric variants. Yeast were co-transformed with the KTG and QAN variants. Target organization is shown on the top panel: targets with a single gtt, cct or gcc half site are in bold; targets with two such half sites, which are expected to be cleaved by homo- and/or heterodimers, are in bold and highlighted in grey; 0: no target. Results are shown on the three panels below. Unexpected faint signals are observed only for gtc/cct and gtt/gtc, cleaved by KTG and QAN, respectively.

FIG. 7 represents the quantitative analysis of the cleavage activity of the heterodimeric variants. (a) Co-transformation of selected mutants in yeast. For clarity, only results on relevant hybrid targets are shown. The aac/acc target is always shown as an example of unrelated target. For the KTGxAGR couple, the palindromic tac and tct targets, although not shown, are cleaved by AGR and KTG, respectively. Cleavage of the cat target by the RRN mutant is very low, and could not be quantified in yeast. (b) Transient co-transfection in CHO cells. For (a) and (b), black bars: signal for the first mutant alone; grey bars: signal for the second mutant alone; striped bars: signal obtained by co-expression or cotransfection.

FIG. 8 illustrates the activity of the assembled heterodimer ARS-KRE on the selected mouse chromosome 17 DNA target. CHO-K1 cells were co-transfected with equimolar amounts of target LagoZ plasmid, ARS and KRE expression plasmids, and the beta-galactosidase activity was measured. Cells co-transfected with the LagoZ plasmid and the I-SceI, I-CreI, ARS or KRE recombinant plasmid or an empty plasmid were used as control.

EXAMPLE 1 Screening for New Functional Endonucleases

The method for producing meganuclease variants and the assays based on cleavage-induced recombination in mammal or yeast cells, which are used for screening variants with altered specificity, are described in the International PCT Application WO 2004/067736. These assays result in a functional LacZ reporter gene which can be monitored by standard methods (FIG. 2 a).

A) Material and Methods

a) Construction of Mutant Libraries

I-CreI wt and I-CreI D75N open reading frames were synthesized, as described previously (Epinat et al., N.A.R., 2003, 31, 2952-2962). Mutation D75N was introduced by replacing codon 75 with AAC. The diversity of the meganuclease library was generated by PCR using degenerate primers from Sigma harboring codon VVK (18 codons, amino acids ADEGHKNPQRST) at positions 44, 68 and 70, which interact directly with the bases at positions 3 to 5, and, as DNA template, the I-CreI gene. The final PCR product was digested with specific restriction enzymes, and cloned back into the I-CreI ORF digested with the same restriction enzymes, in pCLS542. In this 2 micron-based replicative vector marked with the LEU2 gene, I-CreI variants are under the control of a galactose inducible promoter (Epinat et al., precited). After electroporation in E. coli, 7×10⁴ clones were obtained, representing 12 times the theoretical diversity at the DNA level (18³=5832). DNA was extracted and transformed into S. cerevisiae strain FYC2-6A (MATα, trp1Δ63, leu2Δ1, his3Δ200). 13824 colonies were picked using a colony picker (QpixII, GENETIX), and grown in 144 microtiter plates.
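The library arithmetic above can be checked with a short Python sketch. It enumerates the VVK codons (V = A, C or G and K = G or T in IUPAC nomenclature) and translates them with the standard genetic code. The function and variable names are illustrative only and are not part of the original protocol.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC degenerate symbols used by the VVK codon.
IUPAC = {"V": "ACG", "K": "GT"}


def expand_degenerate(codon):
    """Return every concrete codon encoded by a degenerate codon."""
    return ["".join(p) for p in product(*(IUPAC.get(s, s) for s in codon))]


vvk_codons = expand_degenerate("VVK")
vvk_residues = "".join(sorted({CODON_TABLE[c] for c in vvk_codons}))

n_positions = 3  # randomized positions 44, 68 and 70
dna_diversity = len(vvk_codons) ** n_positions
protein_diversity = len(vvk_residues) ** n_positions
```

Under these assumptions the sketch gives 18 codons, the twelve residues ADEGHKNPQRST and 18³ = 5832 combinations at the DNA level, in line with the figures quoted above.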

b) Construction of Target Clones

The C1221 twenty-four bp palindrome (tcaaaacgtcgtacgacgttttga, SEQ ID NO: 1) is a repeat of the half-site of the nearly palindromic natural I-CreI target (tcaaaacgtcgtgagacagtttgg, SEQ ID NO: 3). C1221 is cleaved as efficiently as the I-CreI natural target in vitro and ex vivo in both yeast and mammalian cells. The 64 palindromic targets were derived as follows: 64 pairs of oligonucleotides (ggcatacaagtttcaaaacnnngtacnnngttttgacaatcgtctgtca (SEQ ID NO: 4) and reverse complementary sequences) were ordered from Sigma, annealed and cloned into pGEM-T Easy (PROMEGA). Next, a 400 bp PvuII fragment was excised and cloned into the yeast vector pFL39-ADH-LACURAZ, described previously (Epinat et al., precited).
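As an illustration of how the 64 palindromic targets relate to C1221, the following Python sketch builds each 24 bp core from a triplet placed at positions −5, −4 and −3. It assumes that the second nnn of the oligonucleotide template is the reverse complement of the first, as implied by the palindromic design. The helper names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def palindromic_target(triplet):
    """24 bp palindrome carrying `triplet` at positions -5, -4, -3."""
    return "tcaaaac" + triplet + "gtac" + reverse_complement(triplet) + "gttttga"


# One target per possible triplet, keyed by the triplet itself.
targets = {
    "".join(t): palindromic_target("".join(t))
    for t in product("acgt", repeat=3)
}
```

With this construction, the gtc entry reproduces the C1221 sequence given above (SEQ ID NO: 1), and every generated target equals its own reverse complement.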

c) Mating of Meganuclease Expressing Clones and Screening in Yeast

Mating was performed using a colony gridder (QpixII, GENETIX). Mutants were gridded on nylon filters covering YPD plates, using a high gridding density (about 20 spots/cm²). A second gridding process was performed on the same filters to spot a second layer consisting of 64 or 75 different reporter-harboring yeast strains for each variant. Membranes were placed on solid agar YPD rich medium, and incubated at 30° C. for one night, to allow mating. Next, filters were transferred to synthetic medium, lacking leucine and tryptophan, with galactose (1%) as a carbon source (and with G418 for coexpression experiments), and incubated for five days at 37° C., to select for diploids carrying the expression and target vectors. After 5 days, filters were placed on solid agarose medium with 0.02% X-Gal in 0.5 M sodium phosphate buffer, pH 7.0, 0.1% SDS, 6% dimethyl formamide (DMF), 7 mM β-mercaptoethanol, 1% agarose, and incubated at 37° C., to monitor β-galactosidase activity. Results were analyzed by scanning and quantification was performed using proprietary software.

d) Sequence and Re-Cloning of Primary Hits

The open reading frame (ORF) of positive clones identified during the primary screening in yeast was amplified by PCR and sequenced. Then, ORFs were recloned using the Gateway protocol (Invitrogen). ORFs were amplified by PCR on yeast colonies (Akada et al., Biotechniques, 28, 668-670, 672-674), using primers: ggggacaagtttgtacaaaaaagcaggcttcgaaggagatagaaccatggccaataccaaatataacaaagagttcc (SEQ ID NO: 5) and ggggaccactttgtacaagaaagctgggtttaagtcggccgccggggaggatttcttctttctcgc (SEQ ID NO: 6) from PROLIGO. PCR products were cloned in: (i) a yeast gateway expression vector harboring a galactose inducible promoter, LEU2 or KanR as selectable marker and a 2 micron origin of replication, and (ii) a pET 24d(+) vector from NOVAGEN. Resulting clones were verified by sequencing (MILLEGEN).

B) Results

I-CreI is a dimeric homing endonuclease that cleaves a 22 bp pseudo-palindromic target. Analysis of I-CreI structure bound to its natural target has shown that in each monomer, eight residues establish direct interactions with seven bases (Jurica et al., 1998, precited). Residues Q44, R68, R70 contact three consecutive base pairs at positions 3 to 5 (and −3 to −5, FIG. 1). An exhaustive protein library vs. target library approach was undertaken to engineer locally this part of the DNA binding interface. First, the I-CreI scaffold was mutated from D75 to N to decrease likely energetic strains caused by the replacement of the basic residues R68 and R70 in the library that satisfy the hydrogen-acceptor potential of the buried D75 in the I-CreI structure. The D75N mutation did not affect the protein structure, but decreased the toxicity of I-CreI in overexpression experiments. Next, positions 44, 68 and 70 were randomized and 64 palindromic targets resulting from substitutions in positions ±3, ±4 and ±5 of a palindromic target cleaved by I-CreI (Chevalier et al., 2003, precited) were generated, as described in FIG. 1.

A robot-assisted mating protocol was used to screen a large number of meganucleases from our library. The general screening strategy is described in FIG. 2 b. 13,824 meganuclease expressing clones (about 2.3-fold the theoretical diversity) were spotted at high density (20 spots/cm²) on nylon filters and individually tested against each one of the 64 target strains (884,608 spots). 2100 clones showing an activity against at least one target were isolated (FIG. 2 b) and the ORF encoding the meganuclease was amplified by PCR and sequenced. 410 different sequences were identified and a similar number of corresponding clones were chosen for further analysis. The spotting density was reduced to 4 spots/cm² and each clone was tested against the 64 reporter strains in quadruplicate, thereby creating complete profiles (as in FIG. 3 a). 350 positives could be confirmed. Next, to avoid the possibility of strains containing more than one clone, mutant ORFs were amplified by PCR, and recloned in the yeast vector. The resulting plasmids were individually transformed back into yeast. 294 such clones were obtained and tested at low density (4 spots/cm²). Differences with primary screening were observed mostly for weak signals, with 28 weak cleavers appearing now as negatives. Only one positive clone displayed a pattern different from what was observed in the primary profiling.

The 350 validated clones showed very diverse patterns. Some of these new profiles shared some similarity with the wild type scaffold whereas many others were totally different. Various examples are shown on FIG. 3 a. Homing endonucleases can usually accommodate some degeneracy in their target sequences, and one of our first findings was that the original I-CreI protein itself cleaves seven different targets in yeast. Many of our mutants followed this rule as well, with the number of cleaved sequences ranging from 1 to 21 with an average of 5.0 sequences cleaved (standard deviation=3.6). Interestingly, in 50 mutants (14%), specificity was altered so that they cleaved exactly one target. 37 (11%) cleaved 2 targets, 61 (17%) cleaved 3 targets and 58 (17%) cleaved 4 targets. For 5 targets and above, percentages were lower than 10%. Altogether, 38 targets were cleaved by the mutants (FIG. 4 a). It is noteworthy that cleavage was barely observed on targets with an A in position ±3, and never with targets with TGN and CGN at positions ±5, ±4, ±3.

EXAMPLE 2 Novel Meganucleases can Cleave Novel Targets while Keeping High Activity and Narrow Specificity

A) Material and Methods

a) Construction of Target Clones

The 64 palindromic targets were cloned into pGEM-T Easy (PROMEGA), as described in example 1. Next, a 400 bp PvuII fragment was excised and cloned into the mammalian vector pcDNA3.1-LACURAZ-ΔURA, described previously (Epinat et al., precited). The 75 hybrid target sequences were cloned as follows: oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO).

b) Re-Cloning of Primary Hits

The open reading frame (ORF) of positive clones identified during the primary screening in yeast was recloned in: (i) a CHO gateway expression vector pCDNA6.2, following the instructions of the supplier (INVITROGEN), and (ii) a pET 24d(+) vector from NOVAGEN. Resulting clones were verified by sequencing (MILLEGEN).

c) Mammalian Cells Assay

CHO-K1 cell line from the American Type Culture Collection (ATCC) was cultured in Ham's F12K medium supplemented with 10% Fetal Bovine Serum. For transient Single Strand Annealing (SSA) assays, cells were seeded in 12 well-plates at 13×10³ cells per well one day prior to transfection. Cotransfection was carried out the following day with 400 ng of DNA using the EFFECTENE transfection kit (QIAGEN). Equimolar amounts of target LagoZ plasmid and expression plasmid were used. The next day, medium was replaced and cells were incubated for another 72 hours. CHO-K1 cell monolayers were washed once with PBS. The cells were then lysed with 150 μl of lysis/revelation buffer added for β-galactosidase liquid assay (100 ml of lysis buffer (Tris-HCl 10 mM pH7.5, NaCl 150 mM, Triton X100 0.1%, BSA 0.1 mg/ml, protease inhibitors) and 900 ml of revelation buffer (10 ml of Mg 100× buffer (MgCl₂ 100 mM, β-mercaptoethanol 35%), 110 ml ONPG (8 mg/ml) and 780 ml of sodium phosphate 0.1 M pH7.5)), 30 minutes on ice. Beta-galactosidase activity was assayed by measuring optical density at 415 nm. The entire process was performed on an automated Velocity 11 BioCel platform. The beta-galactosidase activity is calculated as relative units normalized for protein concentration, incubation time and transfection efficiency.

d) Protein Expression and Purification

His-tagged proteins were over-expressed in E. coli BL21 (DE3)pLysS cells using pET-24d (+) vectors (NOVAGEN). Induction with IPTG (0.3 mM) was performed at 25° C. Cells were sonicated in a solution of 50 mM Sodium Phosphate (pH 8), 300 mM sodium chloride containing protease inhibitors (Complete EDTA-free tablets, Roche) and 5% (v/v) glycerol. Cell lysates were centrifuged at 100000 g for 60 min. His-tagged proteins were then affinity-purified, using 5 ml Hi-Trap chelating HP columns (Amersham Biosciences) loaded with cobalt. Several fractions were collected during elution with a linear gradient of imidazole (up to 0.25 M imidazole, followed by plateau at 0.5 M imidazole, 0.3 M NaCl and 50 mM Sodium Phosphate pH 8). Protein-rich fractions (determined by SDS-PAGE) were applied to the second column. The crude purified samples were taken to pH 6 and applied to a 5 ml HiTrap Heparin HP column (Amersham Biosciences) equilibrated with 20 mM Sodium Phosphate pH 6.0. Bound proteins were eluted with a sodium chloride continuous gradient with 20 mM sodium phosphate and 1 M sodium chloride. The purified fractions were submitted to SDS-PAGE and concentrated (10 kDa cut-off centriprep Amicon Ultra system), frozen in liquid nitrogen and stored at −80° C. Purified proteins were desalted using PD10 columns (Sephadex G-25M, Amersham Biosciences) in PBS or 10 mM Tris-HCl (pH 8) buffer.

e) In Vitro Cleavage Assays

pGEM plasmids with single meganuclease DNA target cut sites were first linearized with XmnI. Cleavage assays were performed at 37° C. in 10 mM Tris-HCl (pH 8), 50 mM NaCl, 10 mM MgCl₂, 1 mM DTT and 50 μg/ml BSA. 2 nM was used as target substrate concentration. A dilution range between 0 and 85 nM was used for each protein, in 25 μl final volume reaction. Reactions were stopped after 1 hour by addition of 5 μl of 45% glycerol, 95 mM EDTA (pH 8), 1.5% (w/v) SDS, 1.5 mg/ml proteinase K and 0.048% (w/v) bromophenol blue (6× Buffer Stop) and incubated at 37° C. for 30 minutes. Digests were run on agarose electrophoresis gels, and fragments were quantified after ethidium bromide staining, to calculate the percentage of cleavage.

B) Results

Eight representative mutants (belonging to 6 different clusters, see below) were chosen for further characterization (FIG. 3). First, data in yeast were confirmed in mammalian cells, by using an assay based on the transient cotransfection of a meganuclease expressing vector and a target vector, as described in a previous report. The 8 mutant ORFs and the 64 targets were cloned into appropriate vectors, and a robot-assisted microtiter-based protocol was used to co-transfect in CHO cells each selected variant with each one of the 64 different reporter plasmids. Meganuclease-induced recombination was measured by a standard, quantitative ONPG assay that monitors the restoration of a functional β-galactosidase gene. Profiles were found to be qualitatively and quantitatively reproducible in five independent experiments. As shown on FIG. 3 a, strong and medium signals were nearly always observed with both yeast and CHO cells (with the exception of ADK), thereby validating the relevance of the yeast HTS process. However, weak signals observed in yeast were often not detected in CHO cells, likely due to a difference in the detection level (see QRR and targets gtg, gct, and ttc). Four mutants were also produced in E. coli and purified by metal affinity chromatography. Their relative in vitro cleavage efficiencies against the wild-type site and their cognate sites were determined. The extent of cleavage under standardized conditions was assessed across a broad range of concentrations for the mutants (FIG. 3 b). Similarly, the activity of I-CreI wt on these targets was analysed. In many cases, 100% cleavage of the substrate could not be achieved, likely reflecting the fact that these proteins may have little or no turnover (Perrin et al., EMBO J., 1993, 12, 2939-2947; Wang et al., Nucleic Acids Res., 1997, 25, 3767-3776). In general, the in vitro assay confirmed the data obtained in yeast and CHO cells, but surprisingly, the gtt target was efficiently cleaved by I-CreI.

Specificity shifts were obvious from the profiles obtained in yeast and CHO: the I-CreI favorite gtc target was not cleaved or barely cleaved, while signals were observed with new targets. This switch of specificity was confirmed for QAN, DRK, RAT and KTG by in vitro analysis, as shown on FIG. 3 b. In addition, these four mutants, which display various levels of activity in yeast and CHO (FIG. 3 a) were shown to cleave 17-60% of their favorite target in vitro (FIG. 3 b), with similar kinetics to I-CreI (half of maximal cleavage by 13-25 nM). Thus, activity was largely preserved by engineering. Third, the number of cleaved targets varied among the mutants: strong cleavers such as QRR, QAN, ARL and KTG have a spectrum of cleavage in the range of what is observed with I-CreI (5-8 detectable signals in yeast, 3-6 in CHO). Specificity is more difficult to compare with mutants that cleave weakly. For example, a single weak signal is observed with DRK but might represent the only detectable signal resulting from the attenuation of a more complex pattern. Nevertheless, the behavior of variants that cleave strongly shows that engineering preserves a very narrow specificity.

EXAMPLE 3 Hierarchical Clustering Defines Seven I-CreI Variant Families

A) Material and Methods

Clustering was done using hclust from the R package. We used quantitative data from the primary, low density screening. Both variants and targets were clustered using standard hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., American Stat. Assoc., 1963, 58, 236-244). Mutants and targets dendrograms were reordered to optimize positions of the clusters and the mutant dendrogram was cut at the height of 8 to define the clusters.
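The analysis itself was run with hclust in R. As a language-neutral illustration of the underlying idea only, the following Python sketch performs agglomerative clustering with Ward's minimum-variance criterion on Euclidean profiles. It is not a reimplementation of hclust: it does not reproduce dendrogram heights or leaf reordering, and it stops at a requested number of clusters rather than cutting at a height of 8. The function names and the toy profiles are hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(profiles, n_clusters):
    """Greedy agglomerative clustering; returns lists of row indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters whose merge is cheapest under Ward's criterion.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(
                    [profiles[k] for k in clusters[i]],
                    [profiles[k] for k in clusters[j]],
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical cleavage profiles: rows are mutants, columns are targets.
toy_profiles = [
    [0.9, 0.8, 0.0, 0.0],
    [1.0, 0.7, 0.1, 0.0],
    [0.0, 0.1, 0.9, 1.0],
    [0.1, 0.0, 0.8, 0.9],
]
toy_result = ward_clusters(toy_profiles, 2)
```

On this toy input the two mutants with similar profiles on the first two targets end up in one group and the remaining two in the other. The same procedure can be applied to the transposed matrix to group targets.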

B) Results

Next, hierarchical clustering was used to determine whether families could be identified among the numerous and diverse cleavage patterns of the variants. Since primary and secondary screening gave congruent results, quantitative data from the first round of yeast low density screening was used for analysis, to permit a larger sample size. Both variants and targets were clustered using standard hierarchical clustering with Euclidean distance and Ward's method (Ward, J. H., precited) and seven clusters were defined (FIG. 4 b). Detailed analysis is shown for 3 of them (FIG. 4 c) and the results are summarized in Table I.

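TABLE I
Cluster Analysis

For each cluster: number of proteins and examples (FIG. 3a); three preferred targets¹ (sequence, % cleavage); nucleotide in position 4 (%)¹; preferred amino acid² (columns 44, 68, 70).

Cluster 1 (77 proteins; example: QAN)
Three preferred targets¹: gtt 46.2, gtc 18.3, gtg 13.6; Σ = 78.1
Nucleotide in position 4 (%)¹: g 0.5, a 2.0, t 82.4, c 15.1
Preferred amino acid²: Q 80.5% (62/77)

Cluster 2 (8 proteins; example: QRR)
Three preferred targets¹: gtt 13.4, gtc 11.8, tct 11.4; Σ = 36.6
Nucleotide in position 4 (%)¹: g 0, a 4.9, t 56.9, c 38.2
Preferred amino acid²: Q 100.0% (8/8); R 100.0% (8/8)

Cluster 3 (65 proteins; example: ARL)
Three preferred targets¹: gat 27.9, tat 23.2, gag 15.7; Σ = 66.8
Nucleotide in position 4 (%)¹: g 2.4, a 88.9, t 5.7, c 3.0
Preferred amino acid²: A 63.0% (41/65); R 33.8% (22/65)

Cluster 4 (31 proteins; example: AGR)
Three preferred targets¹: gac 22.7, tac 14.5, gat 13.4; Σ = 50.6
Nucleotide in position 4 (%)¹: g 0.3, a 91.9, t 6.6, c 1.2
Preferred amino acid²: A&N 51.6% & 35.4% (16&11/31); R 48.4% (15/31); R 67.7% (21/31)

Cluster 5 (81 proteins; examples: ADK, DRK)
Three preferred targets¹: gat 29.21, tat 15.4, gac 11.4; Σ = 56.05.9
Nucleotide in position 4 (%)¹: g 1.6, a 73.8, t 13.4, c 11.2
Preferred amino acid²: none listed

Cluster 6 (51 proteins; examples: KTG, RAT)
Three preferred targets¹: cct 30.1, tct 19.6, tcc 13.9; Σ = 63.6
Nucleotide in position 4 (%)¹: g 0, a 4.0, t 6.3, c 89.7
Preferred amino acid²: K 62.7% (32/51)

Cluster 7 (37 proteins; no example listed)
Three preferred targets¹: cct 20.8, tct 19.6, tcc 15.3; Σ = 55.7
Nucleotide in position 4 (%)¹: g 0, a 0.2, t 14.4, c 85.4
Preferred amino acid²: K 91.9% (34/37)

¹ frequencies according to the cleavage index, as described in FIG. 4c
² in each position, residues present in more than ⅓ of the cluster are indicated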

For each cluster, a set of preferred targets could be identified on the basis of the frequency and intensity of the signal (FIG. 4 c). The three preferred targets for each cluster are indicated in Table I, with their cleavage frequencies. The sum of these frequencies is a measurement of the specificity of the cluster. For example, in cluster 1, the three preferred targets (gtt/c/g) account for 78.1% of the observed cleavage, with 46.2% for gtt alone, revealing a very narrow specificity. Indeed, this cluster includes several proteins which, like QAN, cleave mostly gtt (FIG. 3 a). In contrast, the three preferred targets in cluster 2 represent only 36.6% of all observed signals. In accordance with the relatively broad and diverse patterns observed in this cluster, QRR cleaves 5 targets (FIG. 3 a), while the activity of other cluster members is not restricted to these 5 targets.
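The specificity measure described above is simply the sum of the cleavage frequencies of the three preferred targets. A minimal Python sketch, using the values reported for clusters 1 and 2 in Table I, is given below. The names preferred, specificity_index and narrowest are illustrative.

```python
# Frequencies (%) of the three preferred targets, taken from Table I.
preferred = {
    1: {"gtt": 46.2, "gtc": 18.3, "gtg": 13.6},
    2: {"gtt": 13.4, "gtc": 11.8, "tct": 11.4},
}


def specificity_index(target_freqs):
    """Sum of the cleavage frequencies of a cluster's preferred targets."""
    return round(sum(target_freqs.values()), 1)


indices = {cluster: specificity_index(f) for cluster, f in preferred.items()}

# The cluster with the highest index concentrates most of its cleavage
# on its three preferred targets, i.e. it is the more narrowly specific one.
narrowest = max(indices, key=indices.get)
```

With these inputs the index is 78.1 for cluster 1 and 36.6 for cluster 2, matching the sums quoted in the text.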

Analysis of the residues found in each cluster showed strong biases for position 44: Q is overwhelmingly represented in clusters 1 and 2, whereas A and N are more frequent in clusters 3 and 4, and K in clusters 6 and 7. Meanwhile, these biases were correlated with strong base preferences for DNA positions ±4, with a large majority of t:a base pairs in clusters 1 and 2, a:t in clusters 3, 4 and 5, and c:g in clusters 6 and 7 (see Table I). The structure of I-CreI bound to its target shows that residue Q44 interacts with the bottom strand in position −4 (and the top strand of position +4, see FIGS. 1 b and 1 c). These results suggest that this interaction is largely conserved in our mutants, and reveal a “code”, wherein Q44 would establish contact with adenine, A44 (or less frequently N44) with thymine, and K44 with guanine. Such correlation was not observed for positions 68 and 70.
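The proposed position-44 “code” can be written as a small lookup. The Python sketch below maps the residue at position 44 to the base it is proposed to contact on the bottom strand at position −4, derives the corresponding top-strand base, and compares it with the preferred targets of three variants listed in Table I. The dictionary and function names are illustrative, and the sketch covers only the residues named in the text.

```python
# Residue at position 44 -> base proposed to be contacted on the bottom
# strand at position -4 (equivalently, on the top strand at +4).
CONTACTED_BASE = {"Q": "a", "A": "t", "N": "t", "K": "g"}
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def preferred_top_base_at_minus4(residue44):
    """Predicted top-strand base at -4, or None if the code does not cover it."""
    contacted = CONTACTED_BASE.get(residue44)
    return COMPLEMENT[contacted] if contacted else None


# Variants are named by residues 44/68/70; triplets give positions -5, -4, -3
# of the first preferred target of their cluster in Table I.
examples = {"QAN": "gtt", "ARL": "gat", "KTG": "cct"}
agreement = {
    name: preferred_top_base_at_minus4(name[0]) == triplet[1]
    for name, triplet in examples.items()
}
```

For the three variants shown, the predicted base at −4 agrees with the middle base of the preferred triplet.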

EXAMPLE 4 Variants can be Assembled in Functional Heterodimers to Cleave New DNA Target Sequences

A) Materials and Methods

The 75 hybrid target sequences were cloned as follows: oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO). Double-stranded target DNA, generated by PCR amplification of the single stranded oligonucleotides, was cloned using the Gateway protocol (INVITROGEN) into yeast and mammalian reporter vectors. Yeast reporter vectors were transformed into S. cerevisiae strain FYBL2-7B (MATa, ura3Δ851, trp1Δ63, leu2Δ1, lys2Δ202).
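To make the notion of a hybrid target concrete, the following Python sketch assembles the top strand of a chimeric site from two half-site triplets, following the convention of FIG. 5 (first triplet on the top strand at −5, −4, −3; second triplet on the bottom strand at 5, 4, 3). It assumes the 24 bp C1221-style flanks and the gtac centre used for the palindromic targets. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hybrid_target(left_triplet, right_triplet):
    """Top strand of a chimeric site built from two half-site triplets."""
    return (
        "tcaaaac"
        + left_triplet                        # top strand, positions -5 to -3
        + "gtac"                              # central four base pairs
        + reverse_complement(right_triplet)   # bottom strand reads right_triplet
        + "gttttga"
    )


gtt_cct = hybrid_target("gtt", "cct")
```

Reading the bottom strand of gtt_cct gives the cct/gtt site, so the two orientations describe the same double-stranded target; using the same triplet twice gives back the corresponding palindrome.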

B) Results

Variants are homodimers capable of cleaving palindromic sites. To test whether the list of cleavable targets could be extended by creating heterodimers that would cleave hybrid cleavage sites (as described in FIG. 5), a subset of I-CreI variants with distinct profiles was chosen and cloned in two different yeast vectors marked by LEU2 or KAN genes. Combinations of mutants having mutations at positions 44, 68 and/or 70 and N at position 75 were then co-expressed in yeast with a set of palindromic and non palindromic chimeric DNA targets. An example is shown on FIG. 6: co-expression of the K44, T68, G70, N75 (KTG) and Q44, A68, N70, N75 (QAN) mutants resulted in the cleavage of two chimeric targets, gtt/gcc and gtt/cct, that were not cleaved by either mutant alone. The palindromic gtt, cct and gcc targets (and other targets of KTG and QAN) were also cleaved, likely resulting from homodimeric species formation, but unrelated targets were not. In addition, a gtt, cct or gcc half-site was not sufficient to allow cleavage, since such targets were fully resistant (see ggg/gcc, gat/gcc, gcc/tac, and many others, on FIG. 6). Unexpected cleavage was observed only with gtc/cct and gtt/gtc, with KTG and QAN homodimers, respectively, but signal remained very weak. Thus, efficient cleavage requires the cooperative binding of two mutant monomers. These results demonstrate a good level of specificity for heterodimeric species.

Altogether, a total of 112 combinations of 14 different proteins were tested in yeast, and 37.5% of the combinations (42/112) revealed a positive signal on their predicted chimeric target. Quantitative data are shown for six examples on FIG. 7 a, and for the same six combinations, results were confirmed in CHO cells in transient co-transfection experiments, with a subset of relevant targets (FIG. 7 b). As a general rule, functional heterodimers were always obtained when one of the two expressed proteins gave a strong signal as homodimer. For example, DRN and RRN, two low activity mutants, give functional heterodimers with strong cutters such as KTG or QRR (FIGS. 7 a and 7 b) whereas no cleavage of chimeric targets could be detected by co-expression of the same weak mutants.

EXAMPLE 5 Cleavage of a Natural DNA Target by Assembled Heterodimer

A) Materials and Methods

a) Genome Survey

A natural target potentially cleaved by an I-CreI variant was identified by scanning the public databases for genomic sequences matching the pattern caaaacnnnnnnnnnnnngttttg, wherein n is a, t, c, or g (SEQ ID NO: 2). The natural target DNA sequence caaaactatgtagagggttttg (SEQ ID NO: 7) was identified in mouse chromosome 17.
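A survey of this kind can be sketched with a regular expression. The Python example below looks for a caaaac flank, a run of unspecified bases and a gttttg flank. The number of unspecified bases is left as a parameter and is set to 10 in the example as an assumption, because the 22-nucleotide target reported above has ten bases between the two flanks. The function name and the toy sequence context are hypothetical; the actual database search tool is not described in the text.

```python
import re


def find_candidate_sites(genome, spacer_len):
    """Yield (start, site) for every caaaac-N{spacer_len}-gttttg match."""
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=(caaaac[acgt]{%d}gttttg))" % spacer_len)
    for match in pattern.finditer(genome.lower()):
        yield match.start(), match.group(1)


# Hypothetical context around the reported mouse target (SEQ ID NO: 7).
toy_genome = "ggatcc" + "caaaactatgtagagggttttg" + "aagctt"
hits = list(find_candidate_sites(toy_genome, spacer_len=10))
```

Because caaaac and gttttg are reverse complements of each other, a site on the bottom strand also matches the pattern on the top strand, so scanning a single strand is sufficient in this sketch.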

This DNA sequence is potentially cleaved by a combination of two I-CreI variants cleaving the sequences tcaaaactatgtgaatagttttga (SEQ ID NO: 8) and tcaaaaccctgtgaagggttttga (SEQ ID NO: 9), respectively.
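The reasoning behind this choice of variants can be illustrated as follows. The Python sketch below reads the two half-site triplets of the 22-nucleotide natural target: positions −5 to −3 on the top strand, and positions 5 to 3 on the bottom strand. Each triplet then designates the palindromic target against which a variant is to be selected. The function name is hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def split_half_sites(target22):
    """Return the (-5, -4, -3) triplets seen by each monomer on a 22 nt target."""
    if len(target22) != 22:
        raise ValueError("expected a 22-nucleotide target")
    left = target22[6:9]                         # top strand, positions -5 to -3
    right = reverse_complement(target22[13:16])  # bottom strand, positions 5 to 3
    return left, right


half_sites = split_half_sites("caaaactatgtagagggttttg")
```

For the mouse chromosome 17 target this gives the triplets tat and cct, which are the triplets carried at positions −5 to −3 by SEQ ID NO: 8 and SEQ ID NO: 9, respectively.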

b) Isolation of Meganuclease Variants

Variants were selected by the cleavage-induced recombination assay in yeast, as described in example 1, using the sequence tcaaaactatgtgaatagttttga (SEQ ID NO: 8) or the sequence tcaaaaccctgtgaagggttttga (SEQ ID NO: 9) as targets.

c) Construction of the Target Plasmid

Oligonucleotides were designed that contained two different half sites of each mutant palindrome (PROLIGO). Double-stranded target DNA, generated by PCR amplification of the single stranded oligonucleotides, was cloned using the Gateway protocol (INVITROGEN) into the mammalian reporter vector pcDNA3.1-LACURAZ-ΔURA, described previously (Epinat et al., precited), to generate the target LagoZ plasmid.

d) Construction of Meganuclease Expression Vector

The open reading frames (ORFs) of the clones identified during the screening in yeast were amplified by PCR on yeast colonies and cloned individually in the CHO expression vector pCDNA6.2 (INVITROGEN), as described in example 1. I-CreI variants were expressed under the control of the CMV promoter.

e) Mammalian Cells Assay

CHO-K1 cells were transiently co-transfected with equimolar amounts of target LagoZ plasmid and expression plasmids, and the beta-galactosidase activity was measured as described in examples 2 and 4.

B) Results

A natural DNA target potentially cleaved by I-CreI variants was identified by performing a genome survey of sequences matching the pattern caaaacnnnnnnnnnnnngttttg (SEQ ID NO: 2). A randomly chosen DNA sequence (SEQ ID NO: 7) identified in chromosome 17 of the mouse was cloned into a reporter plasmid. This DNA target was potentially cleaved by a combination of the I-CreI variants A44,R68,S70,N75 (ARS) and K44,R68,E70,N75 (KRE).

The co-expression of these two variants in CHO cells leads to the formation of a functional heterodimeric protein, as shown in FIG. 8. Indeed, when the I-CreI variants were expressed individually, virtually no cleavage activity could be detected on the mouse DNA target, although the KRE protein showed a residual activity. In contrast, when these two variants were co-expressed together with the plasmid carrying the potential target, a strong beta-galactosidase activity could be measured. Altogether, these data revealed that heterodimerization occurred in the CHO cells and that heterodimers were functional.

These data demonstrate that heterodimeric proteins, created by assembling homodimeric variants, extend the list of naturally occurring DNA target sequences to all the potential hybrid cleavable targets resulting from all possible combinations of the variants.

Moreover, these data demonstrated that it is possible to predict the DNA sequences that can be cleaved by a combination of variants, knowing the DNA target cleaved by each variant as a homodimer. Furthermore, the nucleotides at positions 1 and 2 (and −1 and −2) of the target can be different from gtac, indicating that they play little role in DNA/protein interaction.

1-38. (canceled)
 39. A recombinant heterodimeric meganuclease comprising two separate polypeptides, wherein each of said polypeptides comprises a LAGLIDADG (SEQ ID NO: 20) Homing Endonuclease core domain and each of said polypeptides is a different LAGLIDADG (SEQ ID NO: 20) Homing Endonuclease I-CreI, at least one of said polypeptides having a lysine in position 44 of I-CreI amino acid sequence according to the amino acid numbering of the I-CreI sequence of SWISSPROT accession number P05725 (SEQ ID NO: 21), and wherein said two separate polypeptides are able to assemble and to cleave a chimeric DNA target sequence selected from the group consisting of: c⁻¹¹a⁻¹⁰a⁻⁹a⁻⁸a⁻⁷c⁻⁶n⁻⁵c⁻⁴n⁻³n⁻²n⁻¹n₊₁n₊₂n₊₃n₊₄n₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁ (SEQ ID NO: 22); and c⁻¹¹a⁻¹⁰a⁻⁹a⁻⁸a⁻⁷c⁻⁶n⁻⁵n⁻⁴n⁻³n⁻²n⁻¹n₊₁n₊₂n₊₃g₊₄n₊₅g₊₆t₊₇t₊₈t₊₉t₊₁₀g₊₁₁ (SEQ ID NO: 23), wherein n is a, t, c, or g.
 40. A composition comprising at least one recombinant heterodimeric meganuclease according to claim 39.
 41. The composition according to claim 40, further comprising a targeting DNA construct comprising the sequence which repairs the site of interest flanked by sequence sharing homologies with the targeted locus.
 42. The recombinant heterodimeric meganuclease of claim 39, wherein one of the polypeptides comprising a LAGLIDADG (SEQ ID NO: 20) homing endonuclease core domain comprises the wild type I-CreI sequence of SEQ ID NO: 21.